618 results found.
Written
Corpus,
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
Creative Commons Attribution-Non Commercial-Share Alike 4.0 International License
Size:
489608 tokens
Production Status:
Newly created-finished
Use:
Machine Learning
- Paper title: Corpus REDEWIEDERGABE
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Annelen Brunner | Corpus REDEWIEDERGABE (Core corpus) | /N |
Documentation:
http://redewiedergabe.de/richtlinien/richtlinien.html; https://github.com/redewiedergabe
Written
Corpus,
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
Creative Commons Attribution-Non Commercial-Share Alike 4.0 International License
Size:
429336 tokens
Production Status:
Newly created-in progress
Use:
Machine Learning
- Paper title: Corpus REDEWIEDERGABE
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Annelen Brunner | Corpus REDEWIEDERGABE (Additional material) | /N |
Documentation:
http://redewiedergabe.de/richtlinien/richtlinien.html; https://github.com/redewiedergabe
Written
Terminology,
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
Creative Commons Share Alike
Size:
1030 lexemes
Production Status:
Newly created-finished
Use:
Difficulty of domain-specific German closed compounds
- Paper title: A Domain-Specific Dataset of Difficulty Ratings for German Noun Compounds in the Domains DIY, Cooking and Automotive
- Paper track: Terminology/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sabine Schulte im Walde | Domain-Specific Dataset of Difficulty Ratings for German Noun Compounds | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
OpenSource
Size:
240000 entries
Production Status:
Newly created-finished
Use:
Summarisation
- Paper title: Summarization Corpora of Wikipedia Articles
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Dominik Frefel | GeWiki | /N |
Documentation:
Paper in English
Written
Corpus,
Language Type:
Multilingual
Languages:
Czech English German Hindi Italian Persian
Availability:
Freely Available
License:
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Size:
162M sentences
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: LSCP: Enhanced Large Scale Colloquial Persian Language Understanding
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mahdi Bohlouli | Large-Scale Colloquial Persian 0.5 | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English Finnish French German Russian Swedish
Availability:
Freely Available
License:
CC - BY - NC
Size:
2 GByte
Production Status:
Existing-used
Use:
Textual Entailment and Paraphrasing
- Paper title: Comparative Study of Sentence Embeddings for Contextual Paraphrasing
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Louisa Pragst | Opusparcus | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
German
Availability:
From Owner
License:
Size:
323 minutes
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Multi-Staged Cross-Lingual Acoustic Model Adaption for Robust Speech Recognition in Real-World Applications - A Case Study on German Oral History Interviews
- Paper track: Speech/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Michael Gref | Difficult Speech Corpus (DiSCo) | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
German
Availability:
From Owner
License:
Size:
1005 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Multi-Staged Cross-Lingual Acoustic Model Adaption for Robust Speech Recognition in Real-World Applications - A Case Study on German Oral History Interviews
- Paper track: Speech/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Michael Gref | GER-TV1000h | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
German
Availability:
Freely Available
License:
CreativeCommons
Size:
125KB, 1904+296 sentences
Production Status:
Newly created-finished
Use:
Opinion Mining/Sentiment Analysis
- Paper title: Doctor Who? Framing Through Names and Titles in German
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Esther van den Berg | Twitter Datasets | /N |
Documentation:
English README.txt accompanying dataset
Written
Treebank,
Language Type:
Monolingual
Languages:
Afrikaans Akkadian Amharic Ancient Greek Arabic Armenian Assyrian Bambara Basque Belarusian Bhojpuri Breton Bulgarian Buryat Cantonese Catalan Chinese Classical Chinese Coptic Croatian Czech Danish Dutch English Erzya Estonian Faroese Finnish French Galician German Gothic Greek Hebrew Hindi Hindi English Hungarian Indonesian Irish Italian Japanese Karelian Kazakh Komi Permyak Komi Zyrian Korean Kurmanji Latin Latvian Lithuanian Livvi Maltese Marathi Mbya Guarani Moksha Naija North Sami Norwegian Old Church Slavonic Old French Old Russian Persian Polish Portuguese Romanian Russian Sanskrit Scottish Gaelic Serbian Skolt Sami Slovak Slovenian Spanish Swedish Swedish Sign Language Swiss German Tagalog Tamil Telugu Thai Turkish Ukrainian Upper Sorbian Urdu Uyghur Vietnamese Warlpiri Welsh Wolof Yoruba
Availability:
Freely Available
License:
Various
Size:
25 million words
Production Status:
Existing-updated
Use:
Parsing and Tagging
- Paper title: Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Joakim Nivre | Universal Dependencies | /N |
Documentation:
https://universaldependencies.org




